Crowdsourcing Evaluations of Classifier Interpretability
Authors
Abstract
This paper presents work using crowdsourcing to assess explanations for supervised text classification. In this paper, an explanation is defined to be a set of words from the input text that a classifier or human believes to be most useful for making a classification decision. We compared two types of explanations for classification decisions: human-generated and computer-generated. The comparison is based on whether the type of an explanation was identifiable and on which type of explanation was preferred. Crowdsourcing was used to collect two types of data for these experiments. First, human-generated explanations were collected by having users select an appropriate category for a piece of text and highlight the words that best support that category. Second, users were asked to compare human- and computer-generated explanations and indicate which they preferred and why. The crowdsourced data used for this paper were collected primarily via Amazon's Mechanical Turk, using several quality control methods. We found that for one test corpus, the two explanation types were virtually indistinguishable, and participants did not have a significant preference for one type over the other. For another corpus, the explanations were slightly more distinguishable, and participants preferred the computer-generated explanations by a small but statistically significant margin. We conclude that computer-generated explanations for text classification can be comparable in quality to human-generated explanations.
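Below is a minimal sketch of how a computer-generated explanation of the kind defined above might be extracted: the k words of a document that a classifier finds most useful for its decision. The paper does not state which classifier was used; the bag-of-words logistic regression, the explain() helper, and the toy training texts here are assumptions made purely for illustration.

# Minimal sketch (assumed setup): rank the words of a document by their
# contribution to a linear classifier's predicted class and keep the top k.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

train_texts = ["the pitcher threw a perfect game", "the senate passed the new bill"]
train_labels = ["sports", "politics"]

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(train_texts)
clf = LogisticRegression().fit(X, train_labels)

def explain(text, k=3):
    """Return the predicted class and the k words contributing most to it."""
    x = vectorizer.transform([text])
    pred = clf.predict(x)[0]
    vocab = vectorizer.get_feature_names_out()
    # For a binary LogisticRegression, coef_[0] scores classes_[1];
    # flip its sign when the predicted class is classes_[0].
    weights = clf.coef_[0] if pred == clf.classes_[1] else -clf.coef_[0]
    # Contribution of each word present in the text = count * class weight.
    scores = {vocab[j]: x[0, j] * weights[j] for j in x.nonzero()[1]}
    return pred, sorted(scores, key=scores.get, reverse=True)[:k]

print(explain("the senate debated the pitcher's contract"))

The words highlighted by crowd workers play the same role on the human side, so the two word sets can then be shown side by side for the identification and preference judgments described above.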
Similar resources
A Novel Measure for Coherence in Statistical Topic Models
Big data presents new challenges for understanding large text corpora. Topic modeling algorithms help understand the underlying patterns, or "topics", in data. Researchers often read these topics in order to gain an understanding of the underlying corpus. It is important to evaluate the interpretability of these automatically generated topics. Methods have previously been designed to use ...
Crowdsourcing evaluation of high dynamic range image compression
Crowdsourcing is becoming a popular cost effective alternative to lab-based evaluations for subjective quality assessment. However, crowd-based evaluations are constrained by the limited availability of display devices used by typical online workers, which makes the evaluation of high dynamic range (HDR) content a challenging task. In this paper, we investigate the feasibility of using low dyna...
Leveraging non-expert crowdsourcing workers for improper task detection in crowdsourcing marketplaces
Controlling the quality of tasks, i.e., propriety of posted jobs, is a major challenge in crowdsourcing marketplaces. Most existing crowdsourcing services prohibit requesters from posting illegal or objectionable tasks. Operators in marketplaces have to monitor tasks continuously to find such improper ones; however, it is very expensive to manually investigate each task. In this paper, we prese...
Crowdsourcing for Evaluating Machine Translation Quality
The recent popularity of machine translation has increased the demand for the evaluation of translations. However, the traditional evaluation approach, manual checking by a bilingual professional, is too expensive and too slow. In this study, we confirm the feasibility of crowdsourcing by analyzing the accuracy of crowdsourcing translation evaluations. We compare crowdsourcing scores to profess...
Leveraging Crowdsourcing to Detect Improper Tasks in Crowdsourcing Marketplaces
Controlling the quality of tasks is a major challenge in crowdsourcing marketplaces. Most of the existing crowdsourcing services prohibit requesters from posting illegal or objectionable tasks. Operators in the marketplaces have to monitor the tasks continuously to find such improper tasks; however, it is too expensive to manually investigate each task. In this paper, we present the reports of ...
Journal:
Volume / Issue:
Pages:
Published: 2012